Schroedinger-like PageRank equation and localization in the WWW 
The World Wide Web is one of the most important communication systems we use in our everyday
life. Despite its central role, the growth and development of the WWW is not controlled by
any central authority. This situation has created a huge ensemble of connections whose complexity
can be fruitfully described and quantified by network theory [1, 2]. One important application
to sort out the information present in these connections is given by the PageRank algorithm [3].
This quantity is usually computed iteratively, at a large cost in computational time.
In this paper we show that the PageRank can be expressed in terms of a wave function obeying a
Schrödinger-like equation. In particular, the topological disorder given by the imbalance between
outgoing and ingoing links of pages induces structure in the wave function and in the potential.
This allows us to directly localize the pages with the largest score. Through this new representation
we can compute the PageRank without iterative techniques. For most of the cases of interest our
method is faster than the original one. Our results also clarify the role of topology in the diffusion
of information within complex networks. The whole approach opens the possibility of novel techniques
inspired by quantum physics for the analysis of WWW properties.

PACS numbers: 71.23.-k,05.10.-a,89.75.Fb 
Most of the work in the field of complex networks has been related to the analysis of
particular datasets and their modelling through suitably chosen statistical growth models [1, ?].
A more recent approach of particular interest is the study of the physical processes taking
place on a network and interacting with its topology [?], for example diffusion on a complex
network. The large variability in the number of neighbours that characterises complex networks
produces unexpected behaviour in the propagation of viruses and diseases [9]. The same
topological randomness also makes the behaviour of the PageRank unpredictable [?].

This quantity (first introduced by the founders of the Google search engine) measures the
importance of a page in the entire system [5] (see Appendix). The present procedure to compute
this quantity on a given graph is based on numerical iterations. The whole process corresponds
to surfing the WWW with a number of random surfers and assigning to each page, as its PageRank,
the number of times it is crossed by a walker. To avoid trapping states for these surfers,
they are allowed to jump to a completely new page of the WWW (with probability
$(1 - \alpha) \approx 0.15$). The role of the damping factor $\alpha$ has been intensively
studied, since PageRank changes significantly when $\alpha$ is modified [11].
The effectiveness of this ranking procedure is witnessed by the success of the search engine
Google. Since the first investigations, various procedures and algorithms have been presented
to determine the PageRank and its dynamical evolution [?]. It is fair to say that the study of
PageRank constitutes a field of research on its own. The PageRank also has a strong impact on
many Web-related topics such as propagation of trust (e.g. TrustRank, propagation of trust and
distrust) and smoothed classification of pages.
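
The random-surfer procedure described above can be sketched as a plain power iteration. This is a minimal illustration, not the production algorithm of any search engine: the five-page graph `EDGES`, the function name `pagerank_iterative` and the stopping tolerance are assumptions made for the example, and every page is assumed to have at least one outgoing link (no dangling nodes).

```python
# Minimal power-iteration PageRank on a toy directed graph (illustrative only).
# Assumption: every page has at least one outgoing link (no dangling nodes).

# Hypothetical 5-page web: (source, target) means "source links to target".
EDGES = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 4),
         (3, 0), (3, 2), (4, 2), (4, 3)]
N_PAGES = 5


def pagerank_iterative(edges, n, alpha=0.85, tol=1e-12, max_iter=10_000):
    """Iterate P_i = alpha * sum_{j->i} P_j / k_j^out + (1 - alpha) / n."""
    out_deg = [0] * n
    for src, _ in edges:
        out_deg[src] += 1

    ranks = [1.0 / n] * n                      # uniform starting guess
    for _ in range(max_iter):
        new = [(1.0 - alpha) / n] * n          # random-jump contribution
        for src, dst in edges:                 # each link passes on a share
            new[dst] += alpha * ranks[src] / out_deg[src]
        change = sum(abs(a - b) for a, b in zip(new, ranks))
        ranks = new
        if change < tol:
            break
    return ranks


ranks = pagerank_iterative(EDGES, N_PAGES)
for page, score in sorted(enumerate(ranks), key=lambda t: -t[1]):
    print(f"page {page}: {score:.4f}")
```

Each sweep costs one pass over the links, and the number of sweeps needed grows as $\alpha$ approaches 1, which is the cost the iterative approach pays on large graphs.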

We present here a novel approach based on an analogy with quantum physics. In particular we
show that it is possible to rearrange the terms of the PageRank equation in such a way as to
obtain a Schrödinger-like equation for a wave function. For a given page, the wave function is
given by the PageRank divided by the out-degree (i.e., the number of outgoing links). Through
this new approach we recognize that the topology of the system (more precisely, the difference
between the number of ingoing and outgoing links per page) plays the role of a local
quantum-like potential $V$ defined on every page. Two basic assumptions underlie these results.
Firstly, in the Schrödinger equation for the WWW the spatial derivatives in the Laplacian
operator are expressed in the discrete space of a graph [?]. Secondly, we adopt a directed
version of this operator to account for the real topological structure of the Web (see Appendix
for a discussion on this point).

FIG. 1: Top: a pictorial representation of the PageRank equation and its counterpart in terms
of the wave function. Bottom: 3-d plot of the potential $V$ and the corresponding PageRank
measured along concentric shells around an original vertex.

While the first operator in the new equation can be considered a Laplacian operator, the second
operator $V$ plays the role of a local potential (given by the out-degree of a page divided by
the parameter $\alpha$, minus the in-degree of the same page). This is a crucial point, since
through the use of this potential we can distinguish between the case of complete reciprocity [?]
between nodes and the directed case. In the former case we show that the wave function must be
constant everywhere. When reciprocity is broken, a non-trivial PageRank distribution is attained.
A final constant term is also present, given by $(1 - \alpha)/(\alpha N)$, where $N$ is the total
number of pages in the system. With respect to the standard Schrödinger equation, this constant
term describes the behaviour of a source. Every page creates a minimum quantity of PageRank that
accumulates in the most linked (successful) pages. Even before considering the particular cases
in which the equation can be analytically solved, it is possible to understand the "physical"
meaning of these various terms. The directed discrete Laplacian operator accounts for the
diffusive nature of the PageRank, and the potential accounts for the effects of the topology on
the diffusion process. The constant term, which vanishes when $\alpha$ tends to 1, has the form
of a source of charge and is related to the possibility that, after a large jump, a walker
originates in the page. In Figure 1 we show the potential $V$ and the corresponding PageRank
measured along concentric shells around an original vertex.
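
To make the roles of these terms concrete, the sketch below computes the local potential $V_i = k_i^{o}/\alpha - k_i^{I}$ and the constant source term $(1 - \alpha)/(\alpha N)$ for two small graphs. The edge lists and helper names are hypothetical, chosen only for illustration. It also checks the reciprocal case: when every link is reciprocated, in-degree and out-degree coincide, so $V_i$ reduces to $k_i (1/\alpha - 1)$ and vanishes for $\alpha = 1$.

```python
# Local potential V_i = k_i^out / alpha - k_i^in on toy graphs (illustrative).

def degrees(edges, n):
    """Return (out_degree, in_degree) lists for a directed edge list."""
    k_out, k_in = [0] * n, [0] * n
    for src, dst in edges:
        k_out[src] += 1
        k_in[dst] += 1
    return k_out, k_in


def potential(edges, n, alpha=0.85):
    """Potential of each page: out-degree / alpha minus in-degree."""
    k_out, k_in = degrees(edges, n)
    return [k_out[i] / alpha - k_in[i] for i in range(n)]


def source_term(n, alpha=0.85):
    """Constant source (1 - alpha) / (alpha * N), identical for every page."""
    return (1.0 - alpha) / (alpha * n)


# Hypothetical directed graph: page 2 collects more links than it emits.
DIRECTED = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 4),
            (3, 0), (3, 2), (4, 2), (4, 3)]
# Hypothetical fully reciprocal graph: every link appears in both directions.
RECIPROCAL = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 0), (0, 2), (2, 3), (3, 2)]

V_directed = potential(DIRECTED, 5)
V_reciprocal_alpha1 = potential(RECIPROCAL, 4, alpha=1.0)

print("V (directed):", [round(v, 3) for v in V_directed])
print("deepest well at page", min(range(5), key=lambda i: V_directed[i]))
print("V (reciprocal, alpha = 1):", V_reciprocal_alpha1)
print("source term:", source_term(5))
```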

Following this physical analogy, some limiting situations can be immediately understood.
Consider the case in which the network is undirected, which means that the in-degree of every
page is equal to its out-degree. No trapping states are present and we can therefore consider
$\alpha = 1$. The Schrödinger equation then becomes a Laplace equation whose solution is given
by a constant wave function. We then recover the result that the PageRank in this case is
proportional to the degree of the page. This clarifies the role of the potential: it is the
disorder given by the absence of reciprocity in the links that creates the non-trivial
properties of the PageRank observed in the WWW. When $V$ is different from 0 and displays a
minimum (as in the case of an electron wave function in a disordered medium), the wave function
is localized in the well of such a potential. In Figure 2 we show some examples of the topology
of the potential in the proximity of some hubs and the corresponding wave functions; the wave
function is highly peaked at the minima of $V$. Similarly, those nodes that have a low PageRank
tend to be located in the proximity of the maxima of the potential. It is important to stress
that, since the Web does not have a simple topology, the fact that a page is a minimum or a
maximum of the potential is only evident when plotting the values at the nearest neighbours in
the network. By exploiting the topological information in the potential $V$, one is able to gain
information about the spatial distribution of the PageRank. In particular, according to our
picture, the higher PageRank scores will condense within the potential wells. As a very first
approximation one can consider the pages corresponding to the smallest values of $V$. A simple
analysis on a subset of the WWW (the 2005 .eu domain [?]) shows that by considering the 100
minimal values of the potential $V$ one can guess (irrespective of the order) 61 among the top
100 values of PageRank. To improve this result (without considering the general solution for
$\psi$), one would have to take into account gradients and expansions around the local minima
of $V$. Actually, our interpretation suggests a relatively simple way to compute the whole
distribution of the PageRank (independently of the physical interpretation of the directed
Laplacian operator). In principle, once the matrices of the Laplacian operator and the potential
operator are known, the wave function (and hence the set of PageRank values) could be computed
by inverting these operators. This simple operation is unfeasible when the size of the matrix is
given by the tens of billions of pages composing the WWW. Here we adopt instead a different
approach based on matrix expansion, which in principle could also allow the study of the time
evolution. The results of this matrix expansion, plotted against the value of PageRank (computed
in the traditional way on the same subset of the WWW cited above), are shown in Figure 3. One can
increase the order of the expansion as desired, with a computational cost that increases only
linearly with the order (see Appendix for details).
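
The "first approximation" of guessing the top pages from the smallest values of $V$ can be tried on synthetic data. The sketch below is only an illustration under stated assumptions: the graph generator `random_directed_graph`, its seed and all sizes are hypothetical, and the result on such a toy graph is not expected to reproduce the 61-out-of-100 figure reported for the .eu dataset.

```python
# Overlap between the k lowest-potential pages and the k top-PageRank pages
# on a synthetic directed graph (illustrative; not the .eu dataset).
import random


def random_directed_graph(n, seed=0):
    """Toy directed graph in which already-popular pages attract more links."""
    rng = random.Random(seed)
    weight = [1.0] * n
    edges = set()
    for src in range(n):
        for dst in rng.choices(range(n), weights=weight, k=rng.randint(1, 5)):
            if dst != src:
                edges.add((src, dst))
                weight[dst] += 1.0
    linked = {src for src, _ in edges}
    for src in range(n):                      # guarantee out-degree >= 1
        if src not in linked:
            edges.add((src, (src + 1) % n))
    return sorted(edges)


def degrees(edges, n):
    """Return (out_degree, in_degree) lists."""
    k_out, k_in = [0] * n, [0] * n
    for src, dst in edges:
        k_out[src] += 1
        k_in[dst] += 1
    return k_out, k_in


def pagerank(edges, k_out, n, alpha, sweeps=200):
    """Reference PageRank by a fixed number of power-iteration sweeps."""
    ranks = [1.0 / n] * n
    for _ in range(sweeps):
        new = [(1.0 - alpha) / n] * n
        for src, dst in edges:
            new[dst] += alpha * ranks[src] / k_out[src]
        ranks = new
    return ranks


def overlap_of_wells_and_top_pages(edges, n, k, alpha=0.85):
    """Count pages that are both among the k lowest V and the k highest PageRank."""
    k_out, k_in = degrees(edges, n)
    ranks = pagerank(edges, k_out, n, alpha)
    V = [k_out[i] / alpha - k_in[i] for i in range(n)]
    top_pages = set(sorted(range(n), key=lambda i: ranks[i], reverse=True)[:k])
    deepest_wells = set(sorted(range(n), key=lambda i: V[i])[:k])
    return len(top_pages & deepest_wells)


N_PAGES, TOP_K = 1000, 50
edges = random_directed_graph(N_PAGES)
overlap = overlap_of_wells_and_top_pages(edges, N_PAGES, TOP_K)
print(f"{overlap} of the {TOP_K} lowest-V pages are also in the PageRank top {TOP_K}")
```

Computing $V$ needs only the in-degree and out-degree of each page, so this guess costs a single pass over the links.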

FIG. 2: Contour plot of potential and wave function for two opposite cases of attractive (a,b)
and repulsive (c,d) potential. As before, they are computed over concentric shells of neighbours
obtained through a BFS algorithm. In both cases the peaks of the potential are produced by the
topological disorder given by the difference between in-degree and out-degree of a node. The
continuous line is obtained by averaging over nodes on the same shell.

FIG. 3: A plot of the convergence of the expansion for the PageRank. On the x-axis we have the
PageRank computed in the traditional iterative way, on the y-axis the PageRank computed through
our proposed expansion. Every vertex corresponds to a different page. Increasing the order of
the expansion, the points condense on the line $y = x$. This process occurs very quickly for the
top 100 ranking sites of the WWW portion we studied.

In conclusion, by rearranging the terms in the PageRank equation, we have been able to obtain a
Schrödinger-like equation allowing us to define a site potential $V$ in the system. According to
the meaning of the original equation, the PageRank (now in the form of a wave function)
localizes in the minima of this potential, allowing educated guesses on the distribution of the
PageRank score. Furthermore, this representation makes it possible to compute the value of the
PageRank score through matrix expansion rather than iteratively. While in the limit of infinite
pages the latter method is certainly less time consuming, for the subset of the WWW analysed
here the expansion method is noticeably faster. Since the PageRank is also used for a variety of
Web-related problems (for example the propagation of trust), in most of these cases, where one
deals with systems of medium-large size, our approach can be competitive. Finally, the formal
analogy with quantum physics makes available a complete series of theoretical frameworks, like
perturbation theory, that could be used to study the dynamical evolution of the PageRank score
distribution. While with iterative techniques we must compute the PageRank score distribution
anew for any change of the topology, through our approach we can in principle obtain the
evolution of this quantity in a much shorter time. The Schrödinger-like equation we introduce
formally generates a connection among completely different kinds of phenomena at very distant
scales, like the World Wide Web and the microscopic world of quantum physics, and it is likely
to pave the way to a series of activities in this field.



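I. APPENDIX

A. PageRank and wave equation

The equation for the PageRank $P_i$ of a page $i$ is given by:

$$ P_i = \alpha \sum_{j \to i} \frac{P_j}{k_j^{o}} + \frac{1 - \alpha}{N} \tag{1} $$

where $\alpha$ is a parameter tuned to 0.85 in the first application made on the WWW and $N$ is
the total number of pages in the system. The quantity $k_j^{o}$ indicates the out-degree of
page $j$. We can modify the above equation by introducing the quantity

$$ \psi_i = \frac{P_i}{k_i^{o}} \tag{2} $$

defined everywhere apart from the dangling nodes, for which $k_i^{o} = 0$.
The above equation then becomes

$$ \psi_i = \frac{\alpha}{k_i^{o}} \sum_{j \to i} \psi_j + \frac{(1 - \alpha)}{k_i^{o}} \frac{1}{N} \tag{3} $$

With a little algebra we obtain

$$ \psi_i = \frac{\alpha}{k_i^{o}} \left[ \sum_{j \to i} \psi_j - k_i^{I} \psi_i \right] + \alpha \frac{k_i^{I}}{k_i^{o}} \psi_i + \frac{(1 - \alpha)}{k_i^{o}} \frac{1}{N} \tag{4} $$

where the quantity $k_i^{I}$ indicates the in-degree of page $i$.
Rearranging the terms with $\psi_i$ we obtain

$$ \frac{\alpha}{k_i^{o}} \left[ \sum_{j \to i} \psi_j - k_i^{I} \psi_i \right] = \left( 1 - \alpha \frac{k_i^{I}}{k_i^{o}} \right) \psi_i - \frac{(1 - \alpha)}{k_i^{o}} \frac{1}{N} \tag{5} $$

Multiplying both sides by $k_i^{o}/\alpha$ we finally obtain

$$ -\Delta_c(\psi_i) + V_i \psi_i = \frac{(1 - \alpha)}{\alpha} \frac{1}{N} \tag{6} $$

where we have defined

$$ \Delta_c(\psi_i) = \sum_{j \to i} \psi_j - k_i^{I} \psi_i, \qquad V_i = \frac{k_i^{o}}{\alpha} - k_i^{I} \tag{7} $$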
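
The rearrangement can be checked numerically: if $P_i$ solves the PageRank equation, then $\psi_i = P_i / k_i^{o}$ should satisfy $-\Delta_c(\psi_i) + V_i \psi_i = (1 - \alpha)/(\alpha N)$ on every page. The sketch below does this on a hypothetical five-page graph without dangling nodes; the graph, the helper names and the number of sweeps are assumptions made for the example.

```python
# Numerical check that psi_i = P_i / k_i^out satisfies
#   -Delta_c(psi_i) + V_i * psi_i = (1 - alpha) / (alpha * N)
# on a toy graph without dangling nodes (illustrative only).

EDGES = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 4),
         (3, 0), (3, 2), (4, 2), (4, 3)]
N = 5
ALPHA = 0.85

k_out, k_in = [0] * N, [0] * N
for src, dst in EDGES:
    k_out[src] += 1
    k_in[dst] += 1

# Reference PageRank from many power-iteration sweeps (error ~ ALPHA**500).
P = [1.0 / N] * N
for _ in range(500):
    new = [(1.0 - ALPHA) / N] * N
    for src, dst in EDGES:
        new[dst] += ALPHA * P[src] / k_out[src]
    P = new

psi = [P[i] / k_out[i] for i in range(N)]            # wave function
V = [k_out[i] / ALPHA - k_in[i] for i in range(N)]   # local potential
F = (1.0 - ALPHA) / (ALPHA * N)                      # constant source term

# Directed Laplacian: sum of psi over pages linking to i, minus k_in * psi_i.
incoming = [0.0] * N
for src, dst in EDGES:
    incoming[dst] += psi[src]
delta_c = [incoming[i] - k_in[i] * psi[i] for i in range(N)]

residual = [-delta_c[i] + V[i] * psi[i] - F for i in range(N)]
max_residual = max(abs(r) for r in residual)
print("largest residual:", max_residual)
```

The in-degree terms cancel between $\Delta_c$ and $V$, which is why the residual reduces to the original PageRank equation divided by $\alpha$.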
The function $\Delta_c$ is the directed counterpart of the action of the Laplacian operator
$\Delta$ acting on a vector whose $N$ elements are the $\psi_i$. Passing to the vectorial
representation, eq. (6) becomes

$$ (-\Delta_c + V)\, \psi = F \tag{8} $$

where $V$ is a diagonal matrix whose elements are given by the $V_i$ and whose effect on the
vector is to multiply the element $\psi_i$ by $V_i$, and $F$ is a vector whose elements all have
the value $(1 - \alpha)/(\alpha N)$.

B. Role of Directed Laplacian Operator

To clarify the role of $\Delta$ with respect to $\Delta_c$ we made some simple numerical tests.
The traditional Laplacian operator describes a variety of different phenomena, for example the
distribution of electrostatic potential in a dielectric when an electric field is applied to it
(for instance by putting it into a capacitor). In this case the potential varies with the
distance from the plates of the capacitor. We computed exactly this field in a series of
lattices (both regular ones, such as the simple cubic lattice, and some realizations of the
Barabási-Albert network), starting from a completely reciprocal case and randomly deleting some
of the connections. Generally, provided that the proportion of reciprocal links is above the
percolation threshold for the lattice considered, the statistical behaviour of the directed
Laplacian is the same as that of the non-directed one (see Figure 4).

FIG. 4: Plot of the electrostatic potential $\psi$, obeying the equation $\Delta\psi = 0$, in a
2-d simple cubic lattice and in a Barabási-Albert network. In the simple cubic case the upper
and lower layers are kept at the fixed values $\psi = 0$ and $\psi = 1$ respectively. The
average value increases from 0 to 1, even when the reciprocity in the links between pairs of
nodes is broken up to 30% of the completely reciprocal case. In the inset, the plot of the same
quantity for a Barabási-Albert model where the same boundary conditions $\psi = 1$ and
$\psi = 0$ are applied to the leaves and to the core of the structure respectively. Here four
shells are considered starting from a node in the core, and the behaviour is the same even for a
20% removal of reciprocal links. Legend: complete reciprocal lattice (SC); 1%, 5%, 10%, 20% and
30% of edges removed randomly.

C. Expansion of the Wave Function

The solution of the above eq. (8) is formally given by

$$ \psi = (-\Delta_c + V)^{-1} F \tag{9} $$

but for matrices of the size of the WWW ($\sim 10^{10}$ nodes) this solution cannot be used,
since it would be impossible to invert the matrix. We can then make use of matrix expansion
through the following steps:

$$ \psi = (I - V^{-1} \Delta_c)^{-1} V^{-1} F \tag{10} $$

The standard approach would be to use the following expansion

$$ (I - V^{-1} \Delta_c)^{-1} = \sum_{n=0}^{\infty} (V^{-1} \Delta_c)^{n} \tag{11} $$

provided all the eigenvalues $\lambda_h$ of $(V^{-1} \Delta_c)$ satisfy $|\lambda_h| < 1$.
This expansion requires inverting only the diagonal matrix $V$ (which can be done easily by
taking the inverse of the elements on the diagonal). While the physical meaning is clear in this
form, it is mathematically more convenient to simplify both the operators $\Delta_c$ and $V$ by
putting eq. (9) in the simpler form

$$ \psi = (k^{o} - \alpha A)^{-1} F' = (I - \alpha B)^{-1} (k^{o})^{-1} F' \tag{12} $$

where $F' = \alpha F$, $k^{o}$ is a matrix whose elements are all zero apart from the diagonal,
where they are given by the out-degrees of the vertices, and $B = (k^{o})^{-1} A$. Here $A$ is
the matrix acting as $(A\psi)_i = \sum_{j \to i} \psi_j$. This form closely resembles the
original equation for PageRank, with the important caveat that we are now working with the wave
function $\psi$. In this case the expansion

$$ (I - \alpha B)^{-1} = \sum_{n=0}^{\infty} (\alpha B)^{n} \tag{13} $$

does converge, and we can calculate $\psi$, and so the associated PageRank, with the desired
precision. As a preliminary result, this expansion was a factor of three faster in the
calculation of the top 100 of the .eu subset domain of the WWW.
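
A truncated form of the expansion in powers of $\alpha B$ can be sketched as follows. This is an illustration under stated assumptions rather than the implementation behind the timing quoted above: the toy graph, the function names and the chosen orders are hypothetical, and the graph has no dangling nodes. Each extra order costs one pass over the links, so the cost grows linearly with the order. On a graph without dangling nodes the PageRank mass recovered at order $m$ is $1 - \alpha^{m+1}$.

```python
# Truncated expansion psi ~ sum_{n=0}^{order} (alpha*B)^n (k_out)^{-1} F',
# with B = (k_out)^{-1} A and F' = alpha*F, on a toy graph (illustrative only).

EDGES = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 4),
         (3, 0), (3, 2), (4, 2), (4, 3)]
N = 5
ALPHA = 0.85


def out_degrees(edges, n):
    k_out = [0] * n
    for src, _ in edges:
        k_out[src] += 1
    return k_out


def pagerank_by_expansion(edges, n, alpha, order):
    """PageRank P_i = k_i^out * psi_i from the series truncated at `order`."""
    k_out = out_degrees(edges, n)
    # Zeroth term: (k_out)^{-1} F', i.e. (1 - alpha) / (N * k_i^out).
    term = [(1.0 - alpha) / (n * k_out[i]) for i in range(n)]
    psi = term[:]
    for _ in range(order):
        spread = [0.0] * n
        for src, dst in edges:               # (A term)_dst = sum over src -> dst
            spread[dst] += term[src]
        term = [alpha * spread[i] / k_out[i] for i in range(n)]  # alpha * B * term
        psi = [p + t for p, t in zip(psi, term)]
    return [k_out[i] * psi[i] for i in range(n)]


def pagerank_iterative(edges, n, alpha, sweeps=500):
    """Reference values from plain power iteration."""
    k_out = out_degrees(edges, n)
    ranks = [1.0 / n] * n
    for _ in range(sweeps):
        new = [(1.0 - alpha) / n] * n
        for src, dst in edges:
            new[dst] += alpha * ranks[src] / k_out[src]
        ranks = new
    return ranks


reference = pagerank_iterative(EDGES, N, ALPHA)
low_order = pagerank_by_expansion(EDGES, N, ALPHA, order=10)
high_order = pagerank_by_expansion(EDGES, N, ALPHA, order=200)

max_error = max(abs(a - b) for a, b in zip(high_order, reference))
print("mass recovered at order 10:", sum(low_order))
print("largest deviation at order 200:", max_error)
```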